This notebook shows:
As of this writing (September 2017), we have delivered the CETB data product for AMSR-E, six SSM/I sensors, four SSMIS sensors, and SMMR. Find out what the latest publicly-available CETB data from NSIDC are at this location:
http://nsidc.org/data/nsidc-0630
For this notebook, I have included some v1.2 SSM/I F08 data in the data subdirectory. If you wish to run this notebook on your local machine and/or with more data, you may need to adjust the paths and/or filenames.
To run the commands in this notebook yourself, you will need the cetbtools package installed. The cetbtools package requires python3 and can be installed in a fresh conda environment by doing the following:
1) install miniconda on your system, from https://conda.io/miniconda.html
2) create a new python 3 conda env:
conda create -n cetb python=3
3) set up your conda channels in the correct priority order (the important thing is that conda-forge has higher priority than the defaults channel):
conda config --add channels conda-forge
4) activate the environment and install cetbtools and jupyter:
source activate cetb
conda install -c nsidc cetbtools
conda install jupyter
5) from the activated environment, start the jupyter notebook server:
jupyter notebook
Begin by setting up inline matplotlib display, and importing the netCDF4 reader package:
(Don't worry if you see a warning about font cache on your system, you can safely ignore it.)
In [1]:
%pylab notebook
import matplotlib.pyplot as plt
from netCDF4 import Dataset, num2date
import numpy as np
import os
from pylab import rcParams
rcParams['figure.figsize'] = 12, 10
In [2]:
%cd ~/ExploringCETB/data
%ls *.nc
The User Guide tab on the data page at NSIDC:
http://nsidc.org/data/nsidc-0630
has a section called "File Naming Convention" to help understand these filenames.
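As a sketch of how these hyphen-delimited names break down, the pieces can be pulled apart with ordinary string operations. (The field labels below are my own shorthand for illustration; see the "File Naming Convention" section of the User Guide for the authoritative definitions.)

```python
filename = "NSIDC-0630-EASE2_T3.125km-F08_SSMI-1987305-37H-A-SIR-CSU-v1.2.nc"

# Drop the ".nc" extension and split the remaining name on hyphens
parts = filename[:-len(".nc")].split("-")

# Shorthand labels for each field, in the order they appear in the name
keys = ["program", "dataset", "grid", "sensor", "date", "channel",
        "passdir", "algorithm", "producer", "version"]
info = dict(zip(keys, parts))

print(info["grid"], info["sensor"], info["date"], info["channel"])
```

The "date" field is year and day-of-year (YYYYDDD), so 1987305 is day 305 of 1987.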
Choose a file to open, and display the file-level global attributes. These include information like the product version number, the source swath files used to derive this CETB file, the date the CETB grid was produced, and discovery metadata like spatial and temporal coverages:
In [3]:
filename = "NSIDC-0630-EASE2_T3.125km-F08_SSMI-1987305-37H-A-SIR-CSU-v1.2.nc"
f = Dataset(filename, "r", format="NETCDF4")
f
Out[3]:
NetCDF files contain metadata and several kinds of variables. "Dimension" variables are special, in that the variable shares its name with the dimension it describes. For example, in the last couple of lines of output here you will see "time(time)", "y(y)" and "x(x)". These are the dimension variables for the CETB data.
Note that this file contains:
The crs variable contains "coordinate reference system" information about the projection used for this file. The geospatial_x_resolution and geospatial_y_resolution attributes give the grid resolution of the data.
To read the crs variable, use:
In [4]:
crs = f.variables['crs']
crs
Out[4]:
We have encoded the projection information in several ways in this file, including:
If you are looking for the map coordinates (in meters) of the upper left corner of the grid, you can obtain them from the valid_range attributes of the x and y dimension variables, like this:
In [5]:
print(f.variables['y'], f.variables['x'])
The left edge of the x dimension is (in meters) -17367530.44, and the top edge of the y dimension is 6756820.2, so the projected map coordinates of the upper left corner of the upper left cell are [x, y] = [-17367530.44 m, 6756820.2 m].
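The cetbtools package (introduced below) handles the full geographic transformations, but the relationship between projected map coordinates and grid indices is a simple affine one. Here is a minimal sketch, assuming cell centers sit half a cell in from the grid edges, using the corner values printed above and the 3.125 km resolution of this grid:

```python
# Upper-left corner of the grid in projected map coordinates (meters),
# taken from the x/y valid_range values printed above
ul_x, ul_y = -17367530.44, 6756820.2
res = 3125.0  # grid resolution in meters (3.125 km)

def xy_to_rowcol(x, y):
    """Fractional (row, col) of the cell whose center is nearest (x, y).

    Column increases with x; row increases as y decreases; the center of
    the upper-left cell is half a cell in from the corner.
    """
    col = (x - ul_x) / res - 0.5
    row = (ul_y - y) / res - 0.5
    return row, col

# The center of the upper-left cell maps to (row, col) = (0.0, 0.0)
print(xy_to_rowcol(ul_x + res / 2.0, ul_y - res / 2.0))
```

This is only an illustration of the grid geometry; for real work, use the Ease2Transform methods shown later in this notebook.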
In [6]:
data = f.variables['TB'][:]
print(np.shape(data), np.amin(data), np.amax(data))
We have stored each data array with a single time dimension for the date, to facilitate netCDF tools that concatenate data across many files by date. To work with just the 2D array of data, you can use the numpy "squeeze" function like this (notice that the degenerate first dimension for time has been removed, leaving a 2D array):
In [7]:
data = np.squeeze(data)
print(np.shape(data))
You can get the date of the data in this file in several ways. The easiest, human-readable form is in the file attribute "comment". At this point the comment attribute contains only the date, but over time it might get more difficult to parse, since tools like the netCDF Operators (NCO) often append to this string when called on a given file. For now, the comment contains just the epoch date string:
In [8]:
f.comment
Out[8]:
A more machine-readable way to access the date is in the time coordinate variable contents:
In [9]:
d = f.variables['time']
date = d[:]
print(d)
print(date)
Note that this date is encoded as "days since 1972-01-01". To convert the stored number in the time array to a Gregorian date, use the num2date function imported from the netCDF4 package, and then format it as a string with strftime, using whatever format string you choose:
In [10]:
greg_date = num2date(date[:],units=d.units,calendar=d.calendar)
print(greg_date)
print(greg_date[0].strftime("%Y-%m-%d"))
print(greg_date[0].strftime("%b %d %Y"))
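The same conversion can also be done with the standard library alone, which makes the encoding explicit: the stored value is just a day count added to the epoch from the time variable's units. (The day count below is an illustrative value I computed for this file's date, 1987 day 305, i.e. 1987-11-01.)

```python
from datetime import datetime, timedelta

# Epoch from the time variable's units: "days since 1972-01-01"
epoch = datetime(1972, 1, 1)

# 5783 days after the epoch lands on 1987-11-01 (day 305 of 1987),
# the date encoded in this file's name
date = epoch + timedelta(days=5783)
print(date.strftime("%Y-%m-%d"))
```

For real work, prefer num2date as shown above, since it also honors the calendar attribute of the time variable.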
In [11]:
fig, ax = plt.subplots(1,1, figsize=(12,5))
ax.set_title("%s (%s)" %
(os.path.basename(filename), greg_date[0].strftime("%Y-%m-%d")))
plt.imshow(data, cmap='viridis', vmin=np.amin(data), vmax=np.amax(data), interpolation='nearest')
plt.axis('off')
plt.colorbar(shrink=0.75,label='TB')
plt.tight_layout()
In [12]:
num = np.squeeze(f.variables['TB_num_samples'][:])
In [13]:
fig, ax = plt.subplots(1,1, figsize=(12,5))
ax.set_title("%s (%s)" %
(os.path.basename(filename), greg_date[0].strftime("%Y-%m-%d")))
plt.imshow(num, cmap=plt.cm.gray, vmin=np.amin(num), vmax=np.amax(num), interpolation='nearest')
plt.axis('off')
plt.colorbar(shrink=0.75, label='Num')
plt.tight_layout()
The python package cetbtools contains transformation routines for lat,lon <--> x,y <--> row,col coordinates of locations in the EASE-Grid 2.0 CETB grids.
cetbtools is distributed via the anaconda.org package distribution system. You can install the package on your machine using this command in your local conda env:
conda install -c https://conda.anaconda.org/nsidc cetbtools
This package is currently available for python on mac OS X and linux platforms.
Assuming you have cetbtools installed, you should be able to import it into the notebook:
In [14]:
from cetbtools.ease2conv import Ease2Transform
You can access the help and information about the Ease2Transform class:
In [15]:
help(Ease2Transform)
Getting back to the file we have just opened: to get the row, col coordinates of a pixel at a specific location, we can initialize an Ease2Transform object with the long_name of the grid in this file. The initializer need only be called once; the "geographic_to_grid" and "grid_to_geographic" methods can then be called repeatedly. The initializer takes one input argument, the NSIDC mapx name for the projection and grid, which is stored in the crs.long_name attribute we examined earlier:
In [16]:
print(crs.long_name)
In [17]:
grid = Ease2Transform(crs.long_name)
The Ease2Transform method "geographic_to_grid" can be used to transform a (lat, lon) coordinate to (row, col). The returned values for row, col will be real-valued and need to be rounded (up at 0.5) to get the integer row, col values that can be used to index into the data arrays:
In [18]:
lat = 40. # latitude in decimal degrees
lon = -95. # longitude in decimal degrees
(row, col) = grid.geographic_to_grid(lat, lon)
print(row, col)
(irow, icol) = (int(round(row)), int(round(col)))
print(irow, icol)
In [19]:
print("TB at (40.0N, 95.0W) is: %f K" % data[irow, icol])
print("Num samples used: %d" % num[irow, icol])
Use grid_to_geographic to transform a (row, col) coordinate to (lat, lon):
In [20]:
print(grid.grid_to_geographic(row,col))
Lastly, remember to close the netCDF file:
In [21]:
f.close()
Combining many of the previous steps, I can also read a Northern Hemisphere file, like this:
In [22]:
filename = "NSIDC-0630-EASE2_N3.125km-F08_SSMI-1987305-37H-M-SIR-CSU-v1.2.nc"
f = Dataset(filename, "r", format="NETCDF4")
data = f.variables['TB'][:]
data = np.squeeze(data)
num = f.variables['TB_num_samples'][:]
num = np.squeeze(num)
In [23]:
from mpl_toolkits.axes_grid1 import make_axes_locatable
fig, ax = plt.subplots(1,2, figsize=(12,5))
fig.suptitle("%s (%s)" % (
os.path.basename(filename), greg_date[0].strftime("%Y-%m-%d")))
ax[0].axis('off')
dataAx = ax[0].imshow(data, cmap='viridis', vmin=np.amin(data),
vmax=np.amax(data), interpolation='nearest',
aspect='equal')
divider0 = make_axes_locatable(ax[0])
cax0 = divider0.append_axes("right", size="8%", pad=0.08)
dataCbar = plt.colorbar(dataAx, cax=cax0, label='TB')
ax[0].axis('off')
numAx = ax[1].imshow(num, cmap=plt.cm.gray, vmin=np.amin(num),
vmax=np.amax(num), interpolation='nearest',
aspect='equal')
divider1 = make_axes_locatable(ax[1])
cax1 = divider1.append_axes("right", size="8%", pad=0.08)
numCbar = plt.colorbar(numAx, cax=cax1, label='Num')
ax[1].axis('off')
plt.tight_layout()
plt.subplots_adjust(top=0.95)
plt.show()
In [24]:
f.close()
Feel free to contact me with questions about the data or this tutorial: brodzik_at_nsidc.org.